Species Distribution Modeling of Citizen Science Data as a Classification Problem with Class-Conditional Noise

Authors

  • Rebecca A. Hutchinson
  • Liqiang He
  • Sarah C. Emerson
Abstract

Species distribution models relate the geographic occurrence pattern of a species to environmental features and are used for a variety of scientific and management purposes. One source of data for building species distribution models is citizen science, in which volunteers report locations where they observed (or did not observe) sets of species. Since volunteers have variable levels of expertise, citizen science data may contain both false positives and false negatives in the location labels (present vs. absent) they provide, but many common modeling approaches for this task do not address these sources of noise explicitly. In this paper, we propose to formulate the species distribution modeling task as a classification problem with class-conditional noise. Our approach builds on other applications of class-conditional noise models to crowdsourced data, but we focus on leveraging features of the noise processes that are distinct from the class features. We describe the conditions under which the parameters of our proposed model are identifiable and apply it to simulated data and data from the eBird citizen science project.
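As a rough illustration of the class-conditional noise formulation described in the abstract, the sketch below computes the probability that a volunteer reports a species as present, given a latent true occupancy and separate false-negative and false-positive rates. The logistic link functions, the parameter names (`beta`, `alpha_fn`, `alpha_fp`), and the split into class features `x` and noise features `w` are assumptions made for illustration; they are not taken from the paper and may differ from the authors' parameterization.

```python
import math

def sigmoid(z):
    """Logistic function mapping a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def prob_reported_present(x, w, beta, alpha_fn, alpha_fp):
    """P(reported label = present | class features x, noise features w).

    Assumed (hypothetical) logistic forms:
      occ = P(truly present | x)
      fn  = P(reported absent | truly present, w)   (false negative rate)
      fp  = P(reported present | truly absent, w)   (false positive rate)
    """
    occ = sigmoid(dot(beta, x))
    fn = sigmoid(dot(alpha_fn, w))
    fp = sigmoid(dot(alpha_fp, w))
    # Marginalize over the unobserved true label.
    return occ * (1.0 - fn) + (1.0 - occ) * fp

def neg_log_likelihood(data, beta, alpha_fn, alpha_fp):
    """Negative log-likelihood of reported labels y in {0, 1}.

    `data` is an iterable of (x, w, y) triples.
    """
    total = 0.0
    for x, w, y in data:
        p = prob_reported_present(x, w, beta, alpha_fn, alpha_fp)
        total -= math.log(p) if y == 1 else math.log(1.0 - p)
    return total
```

Minimizing `neg_log_likelihood` with any general-purpose optimizer would give parameter estimates under these assumptions. Letting the noise rates depend on `w` rather than `x` reflects the abstract's emphasis on noise-process features that are distinct from the class features.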

Similar resources

Machine Learning and Citizen Science: Opportunities and Challenges of Human-Computer Interaction

Background and Aim: When processing large datasets, scientists face the tedious task of analyzing large volumes of data. Machine learning techniques are a potential solution to this problem. In citizen science, human and artificial intelligence may be combined to facilitate this effort. Considering the ambiguities in machine performance and management of user-generated data, this paper aims to...


Robust Portfolio Optimization with risk measure CVAR under MGH distribution in DEA models

Financial returns exhibit stylized facts such as leptokurtosis, skewness and heavy-tailedness. Given this behavior, in this paper we apply the multivariate generalized hyperbolic (mGH) distribution to portfolio modeling and performance evaluation, using conditional value at risk (CVaR) as a risk measure and allocating the best weights for portfolio selection. Moreover, a robust portfolio optimizati...


A new method for assigning membership to data and identifying noise and outliers using a fuzzy support vector machine

Support Vector Machine (SVM) is an important classification technique that has recently attracted the attention of many researchers. However, this approach has some limitations. Determining the hyperplane that separates the classes with the maximum margin and calculating the position of each point (training data) in an SVM linear classifier can be interpreted as computing a data membersh...


Classification with Asymmetric Label Noise: Consistency and Maximal Denoising

In many real-world classification problems, the labels of training examples are randomly corrupted. Most previous theoretical work on classification with label noise assumes that the two classes are separable, that the label noise is independent of the true class label, or that the noise proportions for each class are known. In this work, we give conditions that are necessary and sufficient for...


An application of Measurement error evaluation using latent class analysis

Latent class analysis (LCA) is a method of evaluating non sampling errors, especially measurement error in categorical data. Biemer (2011) introduced four latent class modeling approaches: probability model parameterization, log linear model, modified path model, and graphical model using path diagrams. These models are interchangeable. Latent class probability models express l...




Publication date: 2017